Link Prediction via Generalized Coupled Tensor Factorisation
This study deals with the missing link prediction problem: the problem of
predicting the existence of missing connections between entities of interest.
We address link prediction using coupled analysis of relational datasets
represented as heterogeneous data, i.e., datasets in the form of matrices and
higher-order tensors. We propose to use an approach based on probabilistic
interpretation of tensor factorisation models, i.e., Generalised Coupled Tensor
Factorisation, which can simultaneously fit a large class of tensor models to
higher-order tensors/matrices with common latent factors using different loss
functions. Numerical experiments demonstrate that joint analysis of data from
multiple sources via coupled factorisation improves the link prediction
performance, and that the selection of the right loss function and tensor model is
crucial for accurately predicting missing links.
PARAFAC2-based Coupled Matrix and Tensor Factorizations
Coupled matrix and tensor factorizations (CMTF) have emerged as an effective
data fusion tool to jointly analyze data sets in the form of matrices and
higher-order tensors. The PARAFAC2 model has been shown to be a promising
alternative to the CANDECOMP/PARAFAC (CP) tensor model due to its flexibility
and capability to handle irregular/ragged tensors. While fusion models based on
a PARAFAC2 model coupled with matrix/tensor decompositions have been recently
studied, they are limited in terms of possible regularizations and/or types of
coupling between data sets. In this paper, we propose an algorithmic framework
for fitting PARAFAC2-based CMTF models with the possibility of imposing various
constraints on all modes and linear couplings, using Alternating Optimization
(AO) and the Alternating Direction Method of Multipliers (ADMM). Through
numerical experiments, we demonstrate that the proposed algorithmic approach
accurately recovers the underlying patterns using various constraints and
linear couplings.
Semi-Supervised Learning using Differentiable Reasoning
We introduce Differentiable Reasoning (DR), a novel semi-supervised learning
technique which uses relational background knowledge to benefit from unlabeled
data. We apply it to the Semantic Image Interpretation (SII) task and show that
background knowledge provides significant improvement. We find that there is a
strong but interesting imbalance between the contributions of updates from
Modus Ponens (MP) and its logical equivalent Modus Tollens (MT) to the learning
process, suggesting that our approach is very sensitive to a phenomenon called
the Raven Paradox. We propose a solution to overcome this situation.
A Time-aware tensor decomposition for tracking evolving patterns
Time-evolving data sets can often be arranged as a higher-order tensor with
one of the modes being the time mode. While tensor factorizations have been
successfully used to capture the underlying patterns in such higher-order data
sets, the temporal aspect is often ignored, allowing for the reordering of time
points. In recent studies, temporal regularizers are incorporated in the time
mode to tackle this issue. Nevertheless, existing approaches still do not allow
underlying patterns to change in time (e.g., spatial changes in the brain,
contextual changes in topics). In this paper, we propose temporal PARAFAC2
(tPARAFAC2): a PARAFAC2-based tensor factorization method with temporal
regularization to extract gradually evolving patterns from temporal data.
Through extensive experiments on synthetic data, we demonstrate that tPARAFAC2
can capture the underlying evolving patterns accurately, performing better than
PARAFAC2 and coupled matrix factorization with temporal smoothness
regularization. Comment: 6 pages, 5 figures
Scalable Tensor Factorizations for Incomplete Data
The problem of incomplete data - i.e., data with missing or unknown values -
in multi-way arrays is ubiquitous in biomedical signal processing, network
traffic analysis, bibliometrics, social network analysis, chemometrics,
computer vision, communication networks, etc. We consider the problem of how to
factorize data sets with missing values with the goal of capturing the
underlying latent structure of the data and possibly reconstructing missing
values (i.e., tensor completion). We focus on one of the most well-known tensor
factorizations that captures multi-linear structure, CANDECOMP/PARAFAC (CP). In
the presence of missing data, CP can be formulated as a weighted least squares
problem that models only the known entries. We develop an algorithm called
CP-WOPT (CP Weighted OPTimization) that uses a first-order optimization
approach to solve the weighted least squares problem. Based on extensive
numerical experiments, our algorithm is shown to successfully factorize tensors
with noise and up to 99% missing data. A unique aspect of our approach is that
it scales to sparse large-scale data, e.g., 1000 x 1000 x 1000 with five
million known entries (0.5% dense). We further demonstrate the usefulness of
CP-WOPT on two real-world applications: a novel EEG (electroencephalogram)
application where missing data is frequently encountered due to disconnections
of electrodes and the problem of modeling computer network traffic where data
may be absent due to the expense of the data collection process.
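The weighted least squares formulation mentioned in this abstract can be illustrated with a minimal NumPy sketch. It assumes a third-order tensor X, a binary indicator tensor W (1 for known entries, 0 for missing ones) and rank-R factor matrices A, B, C. The function names are hypothetical and are not taken from the authors' code; the sketch only evaluates the objective and its gradients, which could then be handed to any first-order optimizer.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    # Rank-R CP model: M[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def weighted_loss_and_grads(X, W, A, B, C):
    # W is assumed binary, so only the known entries contribute to the fit.
    R = W * (X - cp_reconstruct(A, B, C))      # residual on known entries
    loss = 0.5 * np.sum(R ** 2)                # weighted least squares objective
    # Gradients of the loss with respect to each factor matrix.
    gA = -np.einsum('ijk,jr,kr->ir', R, B, C)
    gB = -np.einsum('ijk,ir,kr->jr', R, A, C)
    gC = -np.einsum('ijk,ir,jr->kr', R, A, B)
    return loss, (gA, gB, gC)
```

Because missing entries are zeroed out by W before the residual is formed, they influence neither the loss nor the gradients, which is what lets the model be fitted to the known entries only.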
Cross-product Penalized Component Analysis (XCAN)
Matrix factorization methods are extensively employed to understand complex
data. In this paper, we introduce the cross-product penalized component
analysis (XCAN), a sparse matrix factorization based on the optimization of a
loss function that allows a trade-off between variance maximization and
structural preservation. The approach is based on previous developments,
notably (i) the Sparse Principal Component Analysis (SPCA) framework based on
the LASSO, (ii) extensions of SPCA to constrain both modes of the
factorization, like co-clustering or the Penalized Matrix Decomposition (PMD),
and (iii) the Group-wise Principal Component Analysis (GPCA) method. The result
is a flexible modeling approach that can be used for data exploration in a
large variety of problems. We demonstrate its use with applications from
different disciplines.
Tensor-based fusion of EEG and FMRI to understand neurological changes in Schizophrenia
Neuroimaging modalities such as functional magnetic resonance imaging (fMRI)
and electroencephalography (EEG) provide information about neurological
functions in complementary spatiotemporal resolutions; therefore, fusion of
these modalities is expected to provide better understanding of brain activity.
In this paper, we jointly analyze fMRI and multi-channel EEG signals collected
during an auditory oddball task with the goal of capturing brain activity
patterns that differ between patients with schizophrenia and healthy controls.
Rather than selecting a single electrode or matricizing the third-order tensor
that can be naturally used to represent multi-channel EEG signals, we preserve
the multi-way structure of EEG data and use a coupled matrix and tensor
factorization (CMTF) model to jointly analyze fMRI and EEG signals. Our
analysis reveals that (i) joint analysis of EEG and fMRI using a CMTF model can
capture meaningful temporal and spatial signatures of patterns that behave
differently in patients and controls, and (ii) these differences and the
interpretability of the associated components increase by including multiple
electrodes from frontal, motor and parietal areas, but not necessarily by
including all electrodes in the analysis.
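For orientation, a commonly used form of the CMTF objective is sketched below. The symbols are assumptions made for illustration and do not come from the abstract: a third-order tensor \mathcal{X} (for example the multi-channel EEG data), a matrix Y (for example the fMRI data), and a factor matrix A shared along the first mode of both data sets. The exact formulation used in the paper may differ, for instance through additional weights or penalties.

```latex
\min_{A,B,C,V}\;
  \bigl\| \mathcal{X} - [\![ A, B, C ]\!] \bigr\|_F^2
  + \bigl\| Y - A V^{\top} \bigr\|_F^2,
\qquad
[\![ A, B, C ]\!]_{ijk} = \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr}
```

The shared factor A is what couples the two data sets: each of its columns must help explain both the tensor and the matrix.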
Tracing Network Evolution Using the PARAFAC2 Model
Characterizing time-evolving networks is a challenging task, but it is
crucial for understanding the dynamic behavior of complex systems such as the
brain. For instance, how spatial networks of functional connectivity in the
brain evolve during a task is not well-understood. A traditional approach in
neuroimaging data analysis is to make simplifications through the assumption of
static spatial networks. In this paper, without assuming static networks in
time and/or space, we arrange the temporal data as a higher-order tensor and
use a tensor factorization model called PARAFAC2 to capture underlying patterns
(spatial networks) in time-evolving data and their evolution. Numerical
experiments on simulated data demonstrate that PARAFAC2 can successfully reveal
the underlying networks and their dynamics. We also show the promising
performance of the model in terms of tracing the evolution of task-related
functional connectivity in the brain through the analysis of functional
magnetic resonance imaging data. Comment: 5 pages, 5 figures, conference
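To make the PARAFAC2 structure concrete, here is a small NumPy sketch that generates synthetic slices X_k = A diag(c_k) B_k^T in which the factor B_k changes from slice to slice while its cross-product B_k^T B_k stays constant (the standard PARAFAC2 constraint). The function name, the dimensions and the construction B_k = P_k H with orthonormal P_k are illustrative assumptions, not the simulation setup used in the paper.

```python
import numpy as np

def random_parafac2_slices(I, J, K, R, seed=0):
    # Hypothetical generator of PARAFAC2-structured data (requires J >= R).
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((I, R))            # factor shared by all slices
    C = rng.uniform(0.5, 1.5, size=(K, R))     # per-slice component weights
    H = rng.standard_normal((R, R))            # fixes the common cross-product
    slices, Bs = [], []
    for k in range(K):
        # P_k has orthonormal columns, so B_k.T @ B_k == H.T @ H for every k.
        P_k, _ = np.linalg.qr(rng.standard_normal((J, R)))
        B_k = P_k @ H
        Bs.append(B_k)
        slices.append(A @ np.diag(C[k]) @ B_k.T)
    return slices, Bs
```

Each B_k can be read as the pattern at slice k (for example a spatial network at one time point), so the sequence of B_k matrices describes how the patterns evolve.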